Adapting ADtrees for High Arity Features

Authors

  • Robert Van Dam
  • Irene Langkilde-Geary
  • Dan Ventura
Abstract

ADtrees, a data structure useful for caching sufficient statistics, have been successfully adapted to grow lazily when memory is limited and to update sequentially with an incrementally updated dataset. For low arity symbolic features, ADtrees trade a slight increase in query time for a reduction in overall tree size. Unfortunately, for high arity features, the same technique can often result in a very large increase in query time and a nearly negligible tree size reduction. In the dynamic (lazy) version of the tree, both query time and tree size can increase for some applications. Here we present two modifications to the ADtree which can be used separately or in combination to achieve the originally intended space-time tradeoff in the ADtree when applied to datasets containing very high arity features.
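The space-time tradeoff described in the abstract can be illustrated with a minimal sketch. It assumes the usual ADtree construction in which each attribute's most common value (MCV) is not stored and its count is recovered by subtraction at query time. The names ADNode, count, and rows are hypothetical, and the sketch does not implement the two modifications the paper proposes; it only shows why the subtraction step becomes expensive when an attribute has many distinct values.

```python
from collections import defaultdict


class ADNode:
    """Hypothetical ADtree node: a count plus one 'vary' entry per later attribute."""

    def __init__(self, row_ids, data, start, n_attrs):
        self.count = len(row_ids)
        self.vary = {}  # attr index -> (most common value, {other value: ADNode})
        if not row_ids:
            return
        for a in range(start, n_attrs):
            groups = defaultdict(list)
            for r in row_ids:
                groups[data[r][a]].append(r)
            mcv = max(groups, key=lambda v: len(groups[v]))
            # The MCV subtree is deliberately not built: this is the space saving.
            children = {
                v: ADNode(rs, data, a + 1, n_attrs)
                for v, rs in groups.items()
                if v != mcv
            }
            self.vary[a] = (mcv, children)


def count(node, query):
    """Count rows matching query, a list of (attr_index, value) sorted by attr_index."""
    if not query:
        return node.count
    (a, v), rest = query[0], query[1:]
    if a not in node.vary:
        return 0
    mcv, children = node.vary[a]
    if v != mcv:
        child = children.get(v)
        return count(child, rest) if child is not None else 0
    # MCV case: subtract every stored sibling from the unconstrained count.
    # The loop visits one child per non-MCV value, so its cost grows with arity.
    return count(node, rest) - sum(count(c, rest) for c in children.values())


rows = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x"), ("c", "y")]
root = ADNode(list(range(len(rows))), rows, 0, 2)
print(count(root, [(0, "a"), (1, "x")]))
```

For a binary attribute the subtraction touches a single sibling, which is the "slight increase in query time" case. For an attribute with thousands of values it touches thousands of siblings while saving only one subtree, which matches the imbalance the abstract describes.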


Similar resources

Adapting ADtrees for Improved Performance on Large Datasets with High Arity Features

Robert Van Dam, Department of Computer Science, Master of Science. The ADtree, a data structure useful for caching sufficient statistics, has been successfully adapted to grow lazily when memory is limited and to update sequentially with an incrementally updated dataset. However, even these modified forms of the A...

ADtrees for Fast Counting and for Fast Learning of Association Rules

The problem of discovering association rules in large databases has received considerable research attention. Much research has examined the exhaustive discovery of all association rules involving positive binary literals (e.g. Agrawal et al. 1996). Other research has concerned finding complex association rules for high-arity attributes such as CN2 (Clark and Niblett 1989). Complex association ...

Rewriting Numeric Constraint Satisfaction Problems for Consistency Algorithms

Reformulating constraint satisfaction problems (CSPs) in lower arity is a common procedure when computing consistency. Lower arity CSPs are simpler to treat than high arity CSPs. Several consistency algorithms have exponential complexity in the CSP’s arity, others only work on low

Arities of Symmetry Breaking Constraints

Static symmetry breaking is a well-established technique to speed up the solving process of symmetric Constraint Satisfaction Programs (CSPs). Static symmetry breaking suffers from two inherent problems: symmetry breaking constraints come in great numbers and are of high arity. Here, we consider the problem of high arity. We prove that not even for binary CSPs can we always reduce the arity of ...

Sub-tree Swapping Crossover and Arity Histogram Distributions

Recent theoretical work has characterised the search bias of GP subtree swapping crossover in terms of program length distributions, providing an exact fixed point for trees with internal nodes of identical arity. However, only an approximate model (based on the notion of average arity) for the mixed-arity case has been proposed. This leaves a particularly important gap in our knowledge because...




Publication date: 2008